Module 1 - Dogs: loglinear model for binary data
Background: In 1953, Solomon and Wynne conducted an experiment on avoidance learning in dogs following traumatic experiences such as electric shocks. The apparatus holds a dog in a closed compartment with a steel floor, open on one side with a small barrier the dog can jump over to reach the other side. A high-voltage electric shock is discharged into the steel floor intermittently to stimulate the dog; on each trial the dog is thus left with the option to either take the shock or jump over the barrier to the other side and save itself. Several dogs were subjected to this experiment over many consecutive trials. (Figure: the experimental apparatus.)
Full details of the experiment can be found at http://www.appstate.edu/~steelekm/classes/psy5300/Documents/Solomon&Wynne 1953.pdf
The hypothesis is that most of the dogs learnt to avoid shocks by jumping over the barrier to the other side after suffering the trauma of shocks in previous trials; this learning in turn protects the dogs in future encounters with electric shocks.
Since the experiment aims to study avoidance learning in dogs from past traumatic experiences and to reach a plausible model in which dogs learn to avoid the scenarios responsible for the trauma, we describe the phenomenon with the expression \(\pi_j = A^{x_j} B^{j-x_j}\), where:
\(\pi_j\) is the probability of a dog getting shocked at trial \(j\)
\(A\) & \(B\) are random variables whose values are drawn from a Normal distribution
\(x_j\) is the number of successful avoidances of shock prior to trial \(j\)
\(j-x_j\) is the number of shocks experienced prior to trial \(j\)
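As a quick numeric illustration of the expression, a minimal sketch below computes \(\pi_j\) for hypothetical values of \(A\) and \(B\) (chosen only for demonstration; they are not estimates from the data):

```python
def shock_probability(A, B, j, x_j):
    """pi_j = A**x_j * B**(j - x_j): probability of shock at trial j,
    given x_j prior avoidances and (j - x_j) prior shocks."""
    return (A ** x_j) * (B ** (j - x_j))

# A dog entering trial 5 with 2 avoidances and 3 shocks so far,
# with hypothetical A=0.8, B=0.9:
p = shock_probability(A=0.8, B=0.9, j=5, x_j=2)
print(round(p, 4))  # 0.8**2 * 0.9**3 = 0.4666
```

With \(A < B < 1\), each successful avoidance shrinks the shock probability faster than a shock does, matching the avoidance-learning hypothesis.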
The hypothesis is corroborated below by Bayesian modelling and comprehensive analysis, in Pyro, of the dogs data available from the Solomon-Wynne experiment.
The data is analysed step by step in accordance with the Bayesian workflow described in "Bayesian Workflow", Prof. Andrew Gelman [http://www.stat.columbia.edu/~gelman/research/unpublished/Bayesian_Workflow_article.pdf].
Import the following dependencies.
import os
import torch
import pyro
import random
import time
import numpy as np
import pandas as pd
import re
import itertools
import seaborn as sns
import matplotlib.pyplot as plt
from functools import partial
from collections import defaultdict
from scipy import stats
import pyro.distributions as dist
from torch import nn
from pyro.nn import PyroModule
from pyro.infer import MCMC, NUTS
import plotly
import plotly.express as px
import plotly.figure_factory as ff
pyro.set_rng_seed(1)
# Uncomment following if pandas available with plotly backend
# pd.options.plotting.backend = "plotly"
%matplotlib inline
plt.style.use('default')
1. Model Specification: Dogs Model definition
We intend to find the most probable values for the parameters \(\alpha\) & \(\beta\) (the log counterparts of random variables \(A\) & \(B\), since \(\alpha = \log A\) and \(\beta = \log B\)) in the expression for the likelihood (\(\pi_j\)) of dogs getting shocked.
Generative model for resulting likelihood of shock:
\(y_j \sim \mathrm{Bern}\big(\exp(\alpha \cdot X_{avoidance} + \beta \cdot X_{shocked})\big)\), with priors \(\alpha \sim N(0., 316.)\) and \(\beta \sim N(0., 316.)\)
The above expression is used as a generalised linear model with a log link function, as in the WinBUGS implementation.
BUGS model
\(\log \pi_j = \alpha\, x_j + \beta\, (j - x_j)\)
Here
\(\log\pi_j\) is the log probability of a dog getting shocked at trial \(j\)
\(x_j\) is the number of successful avoidances of shock prior to trial \(j\)
\(j-x_j\) is the number of shocks experienced prior to trial \(j\)
\(\alpha\) is the coefficient corresponding to the number of successes, and \(\beta\) is the coefficient corresponding to the number of failures.
The same model, implemented in PyStan:
Equivalent Stan model
{
    alpha ~ normal(0.0, 316.2);
    beta ~ normal(0.0, 316.2);
    for (dog in 1:Ndogs)
        for (trial in 2:Ntrials)
            y[dog, trial] ~ bernoulli(exp(alpha * xa[dog, trial] + beta * xs[dog, trial]));
}
Model implementation
The model is defined in Pyro, following the generative model expression for this dataset, as follows:
# Dogs model with normal prior
def DogsModel(x_avoidance, x_shocked, y):
    """
    Input
    -------
    x_avoidance: tensor holding avoidance count for all dogs & all trials, e.g. for
                 30 dogs & 25 trials, shaped (30, 25)
    x_shocked: tensor holding shock count for all dogs & all trials, e.g. for 30 dogs
               & 25 trials, shaped (30, 25)
    y: tensor holding response for all dogs & trials, e.g. for 30 dogs
       & 25 trials, shaped (30, 25)
    Output
    --------
    Implements the PyStan model: {
        alpha ~ normal(0.0, 316.2);
        beta ~ normal(0.0, 316.2);
        for (dog in 1:Ndogs)
            for (trial in 2:Ntrials)
                y[dog, trial] ~ bernoulli(exp(alpha * xa[dog, trial] + beta * xs[dog, trial]));}
    """
    alpha = pyro.sample("alpha", dist.Normal(0., 316.))
    beta = pyro.sample("beta", dist.Normal(0., 316.))
    with pyro.plate("data"):
        pyro.sample("obs", dist.Bernoulli(torch.exp(alpha * x_avoidance + beta * x_shocked)), obs=y)
Dogs data
The following holds the dogs data in the PyStan modelling format.
dogs_data = {"Ndogs":30,
"Ntrials":25,
"Y": np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0,
1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1,
1, 0, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1,
0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0,
0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1,
1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]).reshape((30,25))}
The following processes the target label Y to obtain the input data x_avoidance & x_shocked, where:
x_avoidance: number of shock avoidances before the current trial.
x_shocked: number of shocks before the current trial.
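Before the full transformation function, a tiny worked example may help: for one hypothetical dog's response sequence (1 = avoided, 0 = shocked), the counts before each trial are just cumulative sums shifted by one trial.

```python
import numpy as np

# Toy single-dog illustration (this Y is hypothetical, not from the dataset):
# Y[t] = 1 means the dog jumped (avoided) at trial t, 0 means it was shocked.
Y = np.array([0, 0, 1, 0, 1, 1])
trials = len(Y)
xa = np.array([Y[:t].sum() for t in range(trials)])  # avoidances before trial t
xs = np.array([t - xa[t] for t in range(trials)])    # shocks before trial t
print(xa)  # [0 0 0 1 1 2]
print(xs)  # [0 1 2 2 3 3]
```

At every trial t, xa[t] + xs[t] = t, i.e. the prior history splits exactly into avoidances and shocks.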
def transform_data(Ndogs=30, Ntrials=25, Y=np.array([0, 0, 0, 0])):
    """
    Input
    -------
    Ndogs: Total number of dogs, i.e., 30
    Ntrials: Total number of trials, i.e., 25
    Y: Raw responses from data, example: np.array([0, 0, 0, 0])
    Outputs
    ---------
    xa: tensor holding avoidance count for all dogs & all trials
    xs: tensor holding shock count for all dogs & all trials
    y: tensor holding response observation for all dogs & all trials
    """
    y = np.zeros((Ndogs, Ntrials))
    xa = np.zeros((Ndogs, Ntrials))
    xs = np.zeros((Ndogs, Ntrials))

    for dog in range(Ndogs):
        for trial in range(1, Ntrials + 1):
            xa[dog, trial - 1] = np.sum(Y[dog, :trial - 1])      # number of successful avoidances up to the previous trial
            xs[dog, trial - 1] = trial - 1 - xa[dog, trial - 1]  # number of shocks up to the previous trial
    for dog in range(Ndogs):
        for trial in range(Ntrials):
            y[dog, trial] = 1 - Y[dog, trial]  # invert the response so that y = 1 indicates a shock

    xa = torch.tensor(xa, dtype=torch.float)
    xs = torch.tensor(xs, dtype=torch.float)
    y = torch.tensor(y, dtype=torch.float)
    return xa, xs, y
Here the PyStan-format data (a Python dictionary) is passed to the function above, in order to preprocess it into the tensor format required for Pyro sampling.
x_avoidance, x_shocked, y= transform_data(**dogs_data)
print("x_avoidance: %s, x_shocked: %s, y: %s"%(x_avoidance.shape, x_shocked.shape, y.shape))
print("\nSample x_avoidance: %s \n\nSample x_shocked: %s"%(x_avoidance[1], x_shocked[1]))
x_avoidance: torch.Size([30, 25]), x_shocked: torch.Size([30, 25]), y: torch.Size([30, 25])
Sample x_avoidance: tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 2., 2., 3.])
Sample x_shocked: tensor([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 8., 9., 10., 11., 12.,
13., 14., 15., 16., 17., 18., 19., 20., 20., 21., 21.])
2. Prior predictive checking
These checks help in understanding the implications of the prior distributions of the underlying parameters (random variables) in the context of a given generative model, by simulating from the model rather than conditioning on observed data.
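One way to see what such a diffuse prior implies is to push prior draws through the generative expression. The sketch below does this in NumPy with hypothetical avoidance/shock counts; the clipping of the logit at 0 is an assumption of this sketch (the Pyro model itself does not clip), added so that exp(.) stays a valid probability.

```python
import numpy as np

rng = np.random.default_rng(1)
xa, xs = 2.0, 3.0  # hypothetical counts: 2 avoidances & 3 shocks before a trial
alpha = rng.normal(0.0, 316.0, size=1000)        # draws from the Normal(0., 316.) prior
beta = rng.normal(0.0, 316.0, size=1000)
logit = np.minimum(alpha * xa + beta * xs, 0.0)  # cap log-probability at 0
p = np.exp(logit)                                # simulated shock probabilities
print(p.min(), p.max())  # such a vague prior saturates p near 0 or 1
```

This saturation is exactly what a prior predictive check is meant to reveal: a scale of 316 places almost all prior mass on extreme shock probabilities.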
priors_list = [(pyro.sample("alpha", dist.Normal(0., 316.)).item(),
                pyro.sample("beta", dist.Normal(0., 316.)).item()) for index in range(1100)]  # picking 1100 prior samples
prior_samples = {"alpha": list(map(lambda prior_pair: prior_pair[0], priors_list)),
                 "beta": list(map(lambda prior_pair: prior_pair[1], priors_list))}
Sampled output of prior values for alpha & beta is stored in prior_samples above, and is plotted on a KDE plot as follows:
fig = ff.create_distplot(list(prior_samples.values()), list(prior_samples.keys()))
fig.update_layout(title="Prior distribution of parameters", xaxis_title="parameter values", yaxis_title="density", legend_title="parameters")
fig.show()
print("Prior alpha Q(0.5) :%s | Prior beta Q(0.5) :%s"%(np.quantile(prior_samples["alpha"], 0.5), np.quantile(prior_samples["beta"], 0.5)))
Prior alpha Q(0.5) :2.7043710947036743 | Prior beta Q(0.5) :-15.45756196975708
3. Posterior estimation
In the parlance of probability theory, the posterior is the updated belief about a quantity or parameter of interest in light of given evidence and prior information.
For the parameters of interest \(\alpha, \beta\) and evidence \(y\), the posterior can be denoted as \(P(\alpha, \beta \mid y)\).
In this experiment, the posterior \(P(\alpha, \beta \mid y)\) is the probability of parameter values (the coefficient of instances avoided, dubbed retention ability, and the coefficient of instances shocked, dubbed learning ability) given the observed instances \(y\) of getting shocked, where \(P(\alpha, \beta)\) is the prior probability of the parameter values.
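To make the posterior concrete, a minimal sketch of the unnormalized log-posterior \(\log P(\alpha, \beta \mid y)\) is shown below for a toy dataset; the counts, responses, and \((\alpha, \beta)\) values are hypothetical, while the Bernoulli\((\exp(\alpha x_a + \beta x_s))\) likelihood and Normal(0., 316.) priors match the model above. NUTS explores exactly this kind of function.

```python
import numpy as np

def log_posterior(alpha, beta, xa, xs, y):
    """Unnormalized log-posterior: Bernoulli log-likelihood + Normal log-prior."""
    p = np.exp(alpha * xa + beta * xs)  # shock probability per trial (< 1 for alpha, beta < 0)
    log_lik = np.sum(y * np.log(p) + (1 - y) * np.log1p(-p))
    log_prior = -0.5 * (alpha / 316.0) ** 2 - 0.5 * (beta / 316.0) ** 2  # up to constants
    return log_lik + log_prior

xa = np.array([1., 2., 2.])  # avoidances before each trial (hypothetical)
xs = np.array([0., 0., 1.])  # shocks before each trial (hypothetical)
y = np.array([1., 1., 0.])   # 1 = shocked on that trial
print(log_posterior(-0.2, -0.1, xa, xs, y))
```

MCMC sampling then amounts to drawing \((\alpha, \beta)\) pairs in proportion to the exponential of this quantity, rather than evaluating it at a single point.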
The following function takes a Pyro model object with defined priors, the input data, and some configuration for chain count & chain length, then launches an MCMC NUTS sampler and outputs the MCMC chain samples in a Python dictionary format.
def get_hmc_n_chains(pyromodel, xa, xs, y, num_chains=4, base_count=900):
    """
    Input
    -------
    pyromodel: Pyro model object with specific prior distribution
    xa: tensor holding avoidance count for all dogs & all trials
    xs: tensor holding shock count for all dogs & all trials
    y: tensor holding response observation for all dogs & all trials
    num_chains: count of MCMC chains to launch, default 4
    base_count: minimum count of samples in an MCMC chain, default 900
    Outputs
    ---------
    hmc_sample_chains: a dictionary with chain names as keys & a dictionary of parameter vs sampled-values list as values
    """
    hmc_sample_chains = defaultdict(dict)
    possible_samples_list = random.sample(list(np.arange(base_count, base_count + num_chains * 100, 50)), num_chains)
    possible_burnin_list = random.sample(list(np.arange(100, 500, 50)), num_chains)

    t1 = time.time()
    for idx, val in enumerate(list(zip(possible_samples_list, possible_burnin_list))):
        num_samples, burnin = val[0], val[1]
        nuts_kernel = NUTS(pyromodel)
        mcmc = MCMC(nuts_kernel, num_samples=num_samples, warmup_steps=burnin)
        mcmc.run(xa, xs, y)
        hmc_sample_chains['chain_{}'.format(idx)] = {k: v.detach().cpu().numpy() for k, v in mcmc.get_samples().items()}

    print("\nTotal time: ", time.time() - t1)
    hmc_sample_chains = dict(hmc_sample_chains)
    return hmc_sample_chains
hmc_sample_chains= get_hmc_n_chains(DogsModel, x_avoidance, x_shocked, y, num_chains=4, base_count = 900)
Warmup: 0%| | 0/1150 [00:00, ?it/s]
Warmup: 1%| | 10/1150 [00:00, 97.85it/s, step size=3.58e-03, acc. prob=0.637]
Warmup: 2%|β | 20/1150 [00:00, 87.91it/s, step size=1.38e-03, acc. prob=0.697]
Warmup: 3%|β | 40/1150 [00:00, 123.95it/s, step size=4.71e-04, acc. prob=0.729]
Warmup: 5%|β | 53/1150 [00:00, 108.06it/s, step size=4.97e-04, acc. prob=0.742]
Warmup: 6%|β | 64/1150 [00:00, 107.44it/s, step size=1.92e-03, acc. prob=0.758]
Warmup: 7%|β | 75/1150 [00:00, 103.78it/s, step size=2.26e-03, acc. prob=0.763]
Warmup: 8%|β | 97/1150 [00:00, 134.69it/s, step size=1.75e-03, acc. prob=0.767]
Warmup: 10%|β | 112/1150 [00:00, 135.60it/s, step size=8.38e-02, acc. prob=0.763]
Warmup: 11%|β | 126/1150 [00:01, 117.19it/s, step size=1.06e-01, acc. prob=0.767]
Warmup: 12%|ββ | 140/1150 [00:01, 122.65it/s, step size=1.69e-01, acc. prob=0.770]
Sample: 13%|ββ | 153/1150 [00:01, 118.01it/s, step size=1.43e-01, acc. prob=0.945]
Sample: 15%|ββ | 171/1150 [00:01, 132.49it/s, step size=1.43e-01, acc. prob=0.859]
Sample: 16%|ββ | 187/1150 [00:01, 138.21it/s, step size=1.43e-01, acc. prob=0.900]
Sample: 18%|ββ | 203/1150 [00:01, 143.03it/s, step size=1.43e-01, acc. prob=0.863]
Sample: 19%|ββ | 218/1150 [00:01, 143.56it/s, step size=1.43e-01, acc. prob=0.886]
Sample: 20%|ββ | 235/1150 [00:01, 150.69it/s, step size=1.43e-01, acc. prob=0.900]
Sample: 22%|βββ | 251/1150 [00:01, 147.09it/s, step size=1.43e-01, acc. prob=0.910]
Sample: 23%|βββ | 266/1150 [00:02, 144.31it/s, step size=1.43e-01, acc. prob=0.913]
Sample: 25%|βββ | 286/1150 [00:02, 155.96it/s, step size=1.43e-01, acc. prob=0.912]
Sample: 26%|βββ | 302/1150 [00:02, 150.09it/s, step size=1.43e-01, acc. prob=0.912]
Sample: 28%|βββ | 318/1150 [00:02, 136.04it/s, step size=1.43e-01, acc. prob=0.918]
Sample: 29%|βββ | 332/1150 [00:02, 133.82it/s, step size=1.43e-01, acc. prob=0.922]
Sample: 30%|βββ | 346/1150 [00:02, 131.34it/s, step size=1.43e-01, acc. prob=0.924]
Sample: 32%|ββββ | 366/1150 [00:02, 149.52it/s, step size=1.43e-01, acc. prob=0.922]
Sample: 33%|ββββ | 382/1150 [00:02, 132.78it/s, step size=1.43e-01, acc. prob=0.922]
Sample: 35%|ββββ | 397/1150 [00:03, 135.34it/s, step size=1.43e-01, acc. prob=0.924]
Sample: 36%|ββββ | 411/1150 [00:03, 134.95it/s, step size=1.43e-01, acc. prob=0.926]
Sample: 37%|ββββ | 428/1150 [00:03, 143.55it/s, step size=1.43e-01, acc. prob=0.922]
Sample: 39%|ββββ | 443/1150 [00:03, 136.90it/s, step size=1.43e-01, acc. prob=0.925]
Sample: 40%|ββββ | 457/1150 [00:03, 136.93it/s, step size=1.43e-01, acc. prob=0.924]
Sample: 41%|ββββ | 473/1150 [00:03, 143.03it/s, step size=1.43e-01, acc. prob=0.924]
Sample: 42%|βββββ | 488/1150 [00:03, 130.78it/s, step size=1.43e-01, acc. prob=0.925]
Sample: 44%|βββββ | 505/1150 [00:03, 137.24it/s, step size=1.43e-01, acc. prob=0.924]
Sample: 45%|βββββ | 522/1150 [00:03, 145.89it/s, step size=1.43e-01, acc. prob=0.924]
Sample: 47%|βββββ | 540/1150 [00:03, 150.75it/s, step size=1.43e-01, acc. prob=0.926]
Sample: 48%|βββββ | 556/1150 [00:04, 149.47it/s, step size=1.43e-01, acc. prob=0.928]
Sample: 50%|βββββ | 572/1150 [00:04, 145.45it/s, step size=1.43e-01, acc. prob=0.929]
Sample: 52%|ββββββ | 595/1150 [00:04, 165.93it/s, step size=1.43e-01, acc. prob=0.929]
Sample: 53%|ββββββ | 612/1150 [00:04, 159.89it/s, step size=1.43e-01, acc. prob=0.930]
Sample: 55%|ββββββ | 629/1150 [00:04, 160.85it/s, step size=1.43e-01, acc. prob=0.927]
Sample: 56%|ββββββ | 646/1150 [00:04, 146.85it/s, step size=1.43e-01, acc. prob=0.927]
Sample: 57%|ββββββ | 661/1150 [00:04, 132.59it/s, step size=1.43e-01, acc. prob=0.927]
Sample: 59%|ββββββ | 676/1150 [00:04, 136.28it/s, step size=1.43e-01, acc. prob=0.929]
Sample: 60%|ββββββ | 690/1150 [00:05, 134.00it/s, step size=1.43e-01, acc. prob=0.930]
Sample: 62%|βββββββ | 708/1150 [00:05, 142.16it/s, step size=1.43e-01, acc. prob=0.931]
Sample: 63%|βββββββ | 725/1150 [00:05, 148.68it/s, step size=1.43e-01, acc. prob=0.931]
Sample: 64%|βββββββ | 741/1150 [00:05, 139.21it/s, step size=1.43e-01, acc. prob=0.932]
Sample: 66%|βββββββ | 757/1150 [00:05, 144.30it/s, step size=1.43e-01, acc. prob=0.932]
Sample: 67%|βββββββ | 772/1150 [00:05, 140.82it/s, step size=1.43e-01, acc. prob=0.933]
Sample: 68%|βββββββ | 787/1150 [00:05, 141.07it/s, step size=1.43e-01, acc. prob=0.931]
Sample: 70%|βββββββ | 802/1150 [00:05, 136.52it/s, step size=1.43e-01, acc. prob=0.930]
Sample: 71%|βββββββ | 816/1150 [00:05, 135.06it/s, step size=1.43e-01, acc. prob=0.931]
Sample: 72%|ββββββββ | 833/1150 [00:06, 142.30it/s, step size=1.43e-01, acc. prob=0.931]
Sample: 74%|ββββββββ | 852/1150 [00:06, 153.47it/s, step size=1.43e-01, acc. prob=0.931]
Sample: 75%|ββββββββ | 868/1150 [00:06, 149.36it/s, step size=1.43e-01, acc. prob=0.930]
Sample: 77%|ββββββββ | 884/1150 [00:06, 135.98it/s, step size=1.43e-01, acc. prob=0.930]
Sample: 78%|ββββββββ | 900/1150 [00:06, 140.57it/s, step size=1.43e-01, acc. prob=0.931]
Sample: 80%|ββββββββ | 915/1150 [00:06, 141.05it/s, step size=1.43e-01, acc. prob=0.931]
Sample: 81%|ββββββββ | 930/1150 [00:06, 138.08it/s, step size=1.43e-01, acc. prob=0.931]
Sample: 82%|βββββββββ | 945/1150 [00:06, 139.30it/s, step size=1.43e-01, acc. prob=0.931]
Sample: 84%|βββββββββ | 961/1150 [00:06, 144.40it/s, step size=1.43e-01, acc. prob=0.932]
Sample: 85%|βββββββββ | 976/1150 [00:07, 145.79it/s, step size=1.43e-01, acc. prob=0.932]
Sample: 86%|βββββββββ | 991/1150 [00:07, 140.44it/s, step size=1.43e-01, acc. prob=0.931]
Sample: 87%|βββββββββ | 1006/1150 [00:07, 143.00it/s, step size=1.43e-01, acc. prob=0.932]
Sample: 89%|βββββββββ | 1021/1150 [00:07, 137.24it/s, step size=1.43e-01, acc. prob=0.930]
Sample: 90%|βββββββββ | 1035/1150 [00:07, 134.18it/s, step size=1.43e-01, acc. prob=0.931]
Sample: 91%|ββββββββββ| 1051/1150 [00:07, 137.39it/s, step size=1.43e-01, acc. prob=0.931]
Sample: 93%|ββββββββββ| 1065/1150 [00:07, 137.65it/s, step size=1.43e-01, acc. prob=0.930]
Sample: 94%|ββββββββββ| 1079/1150 [00:07, 133.62it/s, step size=1.43e-01, acc. prob=0.930]
Sample: 95%|ββββββββββ| 1093/1150 [00:07, 128.78it/s, step size=1.43e-01, acc. prob=0.930]
Sample: 96%|ββββββββββ| 1107/1150 [00:08, 129.31it/s, step size=1.43e-01, acc. prob=0.929]
Sample: 97%|ββββββββββ| 1120/1150 [00:08, 126.28it/s, step size=1.43e-01, acc. prob=0.929]
Sample: 99%|ββββββββββ| 1133/1150 [00:08, 123.71it/s, step size=1.43e-01, acc. prob=0.930]
Sample: 100%|ββββββββββ| 1150/1150 [00:08, 137.80it/s, step size=1.43e-01, acc. prob=0.929]
Warmup: 0%| | 0/1350 [00:00, ?it/s]
Warmup: 0%| | 6/1350 [00:00, 59.05it/s, step size=6.25e-04, acc. prob=0.575]
Warmup: 1%| | 12/1350 [00:00, 33.84it/s, step size=8.64e-04, acc. prob=0.699]
Warmup: 1%|β | 19/1350 [00:00, 44.43it/s, step size=1.57e-04, acc. prob=0.703]
Warmup: 2%|β | 32/1350 [00:00, 65.59it/s, step size=2.25e-04, acc. prob=0.740]
Warmup: 3%|β | 40/1350 [00:00, 69.76it/s, step size=3.94e-04, acc. prob=0.755]
Warmup: 4%|β | 48/1350 [00:00, 47.65it/s, step size=2.18e-04, acc. prob=0.755]
Warmup: 4%|β | 54/1350 [00:01, 49.22it/s, step size=2.24e-04, acc. prob=0.758]
Warmup: 5%|β | 61/1350 [00:01, 52.98it/s, step size=2.96e-04, acc. prob=0.764]
Warmup: 5%|β | 67/1350 [00:01, 46.10it/s, step size=4.39e-04, acc. prob=0.768]
Warmup: 5%|β | 73/1350 [00:01, 39.93it/s, step size=1.97e-04, acc. prob=0.765]
Warmup: 6%|β | 78/1350 [00:02, 22.96it/s, step size=4.22e-05, acc. prob=0.756]
Warmup: 6%|β | 82/1350 [00:02, 15.44it/s, step size=1.05e-04, acc. prob=0.763]
Warmup: 6%|β | 85/1350 [00:02, 16.40it/s, step size=3.64e-04, acc. prob=0.772]
Warmup: 7%|β | 93/1350 [00:02, 24.58it/s, step size=5.25e-04, acc. prob=0.775]
Warmup: 7%|β | 101/1350 [00:02, 33.18it/s, step size=2.60e-02, acc. prob=0.763]
Warmup: 8%|β | 107/1350 [00:03, 25.13it/s, step size=5.95e-03, acc. prob=0.764]
Warmup: 8%|β | 112/1350 [00:03, 28.35it/s, step size=1.23e-02, acc. prob=0.768]
Warmup: 9%|β | 117/1350 [00:03, 30.33it/s, step size=7.26e-03, acc. prob=0.767]
Warmup: 9%|β | 122/1350 [00:03, 23.38it/s, step size=8.58e-04, acc. prob=0.762]
Warmup: 9%|β | 126/1350 [00:04, 12.08it/s, step size=4.08e-03, acc. prob=0.767]
Warmup: 10%|β | 131/1350 [00:04, 15.57it/s, step size=7.39e-02, acc. prob=0.776]
Warmup: 10%|β | 137/1350 [00:05, 18.71it/s, step size=3.39e-03, acc. prob=0.768]
Warmup: 10%|β | 141/1350 [00:05, 18.48it/s, step size=2.91e-02, acc. prob=0.774]
Warmup: 11%|β | 147/1350 [00:05, 24.13it/s, step size=5.99e-03, acc. prob=0.771]
Warmup: 11%|β | 151/1350 [00:05, 25.85it/s, step size=2.87e-02, acc. prob=0.775]
Warmup: 12%|ββ | 159/1350 [00:05, 35.46it/s, step size=8.67e-03, acc. prob=0.773]
Warmup: 12%|ββ | 164/1350 [00:05, 37.52it/s, step size=1.31e-02, acc. prob=0.774]
Warmup: 13%|ββ | 176/1350 [00:06, 30.92it/s, step size=5.43e-03, acc. prob=0.773]
Warmup: 13%|ββ | 181/1350 [00:06, 33.40it/s, step size=1.62e-02, acc. prob=0.777]
Warmup: 14%|ββ | 186/1350 [00:06, 36.13it/s, step size=8.86e-03, acc. prob=0.775]
Warmup: 14%|ββ | 192/1350 [00:06, 34.67it/s, step size=1.87e-02, acc. prob=0.778]
Warmup: 16%|ββ | 210/1350 [00:06, 61.87it/s, step size=6.00e-03, acc. prob=0.779]
Warmup: 17%|ββ | 227/1350 [00:06, 84.92it/s, step size=3.12e-02, acc. prob=0.782]
Warmup: 18%|ββ | 243/1350 [00:06, 101.87it/s, step size=9.25e-03, acc. prob=0.781]
Sample: 20%|ββ | 269/1350 [00:06, 140.31it/s, step size=1.19e-02, acc. prob=0.814]
Sample: 22%|βββ | 293/1350 [00:07, 166.20it/s, step size=1.19e-02, acc. prob=0.894]
Sample: 23%|βββ | 314/1350 [00:07, 176.60it/s, step size=1.19e-02, acc. prob=0.892]
Sample: 25%|βββ | 341/1350 [00:07, 198.68it/s, step size=1.19e-02, acc. prob=0.883]
Sample: 27%|βββ | 362/1350 [00:07, 191.48it/s, step size=1.19e-02, acc. prob=0.889]
Sample: 28%|βββ | 382/1350 [00:07, 182.06it/s, step size=1.19e-02, acc. prob=0.892]
Sample: 30%|βββ | 401/1350 [00:07, 183.31it/s, step size=1.19e-02, acc. prob=0.886]
Sample: 31%|βββ | 421/1350 [00:07, 185.86it/s, step size=1.19e-02, acc. prob=0.886]
Sample: 33%|ββββ | 440/1350 [00:07, 186.45it/s, step size=1.19e-02, acc. prob=0.894]
Sample: 34%|ββββ | 459/1350 [00:07, 169.70it/s, step size=1.19e-02, acc. prob=0.896]
Sample: 35%|ββββ | 477/1350 [00:08, 172.15it/s, step size=1.19e-02, acc. prob=0.898]
Sample: 37%|ββββ | 495/1350 [00:08, 169.96it/s, step size=1.19e-02, acc. prob=0.903]
Sample: 38%|ββββ | 516/1350 [00:08, 177.38it/s, step size=1.19e-02, acc. prob=0.906]
Sample: 40%|ββββ | 537/1350 [00:08, 185.65it/s, step size=1.19e-02, acc. prob=0.908]
Sample: 41%|ββββ | 556/1350 [00:08, 181.36it/s, step size=1.19e-02, acc. prob=0.903]
Sample: 43%|βββββ | 575/1350 [00:08, 176.48it/s, step size=1.19e-02, acc. prob=0.904]
Sample: 44%|βββββ | 600/1350 [00:08, 194.71it/s, step size=1.19e-02, acc. prob=0.905]
Sample: 46%|βββββ | 620/1350 [00:08, 194.20it/s, step size=1.19e-02, acc. prob=0.906]
Sample: 47%|βββββ | 640/1350 [00:08, 177.63it/s, step size=1.19e-02, acc. prob=0.906]
Sample: 49%|βββββ | 662/1350 [00:09, 188.73it/s, step size=1.19e-02, acc. prob=0.907]
Sample: 51%|βββββ | 682/1350 [00:09, 183.60it/s, step size=1.19e-02, acc. prob=0.905]
Sample: 52%|ββββββ | 702/1350 [00:09, 188.11it/s, step size=1.19e-02, acc. prob=0.906]
Sample: 54%|ββββββ | 723/1350 [00:09, 187.36it/s, step size=1.19e-02, acc. prob=0.907]
Sample: 55%|ββββββ | 745/1350 [00:09, 195.14it/s, step size=1.19e-02, acc. prob=0.909]
Sample: 57%|ββββββ | 765/1350 [00:09, 189.94it/s, step size=1.19e-02, acc. prob=0.907]
Sample: 58%|ββββββ | 788/1350 [00:09, 199.92it/s, step size=1.19e-02, acc. prob=0.907]
Sample: 60%|ββββββ | 812/1350 [00:09, 210.09it/s, step size=1.19e-02, acc. prob=0.907]
Sample: 62%|βββββββ | 834/1350 [00:09, 201.28it/s, step size=1.19e-02, acc. prob=0.906]
Sample: 64%|βββββββ | 859/1350 [00:10, 208.76it/s, step size=1.19e-02, acc. prob=0.905]
Sample: 65%|βββββββ | 880/1350 [00:10, 195.87it/s, step size=1.19e-02, acc. prob=0.905]
Sample: 67%|βββββββ | 900/1350 [00:10, 188.29it/s, step size=1.19e-02, acc. prob=0.903]
Sample: 68%|βββββββ | 922/1350 [00:10, 196.07it/s, step size=1.19e-02, acc. prob=0.903]
Sample: 70%|βββββββ | 946/1350 [00:10, 208.22it/s, step size=1.19e-02, acc. prob=0.901]
Sample: 72%|ββββββββ | 968/1350 [00:10, 194.52it/s, step size=1.19e-02, acc. prob=0.901]
Sample: 73%|ββββββββ | 991/1350 [00:10, 202.70it/s, step size=1.19e-02, acc. prob=0.903]
Sample: 75%|ββββββββ | 1012/1350 [00:10, 198.78it/s, step size=1.19e-02, acc. prob=0.903]
Sample: 77%|ββββββββ | 1034/1350 [00:10, 202.08it/s, step size=1.19e-02, acc. prob=0.904]
Sample: 78%|ββββββββ | 1055/1350 [00:11, 196.36it/s, step size=1.19e-02, acc. prob=0.903]
Sample: 80%|ββββββββ | 1075/1350 [00:11, 196.57it/s, step size=1.19e-02, acc. prob=0.904]
Sample: 82%|βββββββββ | 1102/1350 [00:11, 216.29it/s, step size=1.19e-02, acc. prob=0.903]
Sample: 83%|βββββββββ | 1124/1350 [00:11, 214.77it/s, step size=1.19e-02, acc. prob=0.904]
Sample: 85%|βββββββββ | 1146/1350 [00:11, 213.75it/s, step size=1.19e-02, acc. prob=0.901]
Sample: 87%|βββββββββ | 1169/1350 [00:11, 217.65it/s, step size=1.19e-02, acc. prob=0.902]
Sample: 88%|βββββββββ | 1193/1350 [00:11, 221.27it/s, step size=1.19e-02, acc. prob=0.901]
Sample: 90%|βββββββββ | 1216/1350 [00:11, 223.12it/s, step size=1.19e-02, acc. prob=0.902]
Sample: 92%|ββββββββββ| 1239/1350 [00:11, 205.17it/s, step size=1.19e-02, acc. prob=0.902]
Sample: 93%|ββββββββββ| 1260/1350 [00:12, 197.88it/s, step size=1.19e-02, acc. prob=0.902]
Sample: 95%|ββββββββββ| 1281/1350 [00:12, 191.98it/s, step size=1.19e-02, acc. prob=0.903]
Sample: 96%|ββββββββββ| 1301/1350 [00:12, 185.17it/s, step size=1.19e-02, acc. prob=0.903]
Sample: 98%|ββββββββββ| 1320/1350 [00:12, 185.56it/s, step size=1.19e-02, acc. prob=0.901]
Sample: 99%|ββββββββββ| 1339/1350 [00:12, 183.77it/s, step size=1.19e-02, acc. prob=0.902]
Sample: 100%|ββββββββββ| 1350/1350 [00:12, 107.72it/s, step size=1.19e-02, acc. prob=0.902]
Warmup: 0%| | 0/1300 [00:00, ?it/s]
Warmup: 0%| | 6/1300 [00:00, 40.18it/s, step size=3.56e-04, acc. prob=0.469]
Warmup: 1%| | 11/1300 [00:00, 16.93it/s, step size=2.32e-04, acc. prob=0.613]
Warmup: 1%|β | 18/1300 [00:00, 25.15it/s, step size=5.88e-04, acc. prob=0.698]
Warmup: 2%|β | 27/1300 [00:00, 38.78it/s, step size=7.89e-04, acc. prob=0.730]
Warmup: 4%|β | 49/1300 [00:00, 77.11it/s, step size=2.53e-03, acc. prob=0.765]
Warmup: 5%|β | 61/1300 [00:01, 86.69it/s, step size=1.85e-03, acc. prob=0.767]
Warmup: 6%|β | 76/1300 [00:01, 102.71it/s, step size=1.34e-03, acc. prob=0.769]
Warmup: 7%|β | 89/1300 [00:01, 107.72it/s, step size=1.67e-03, acc. prob=0.773]
Warmup: 8%|β | 103/1300 [00:01, 110.69it/s, step size=7.12e-02, acc. prob=0.759]
Warmup: 9%|β | 115/1300 [00:01, 90.14it/s, step size=1.88e-01, acc. prob=0.767]
Warmup: 10%|β | 127/1300 [00:01, 96.77it/s, step size=3.26e-01, acc. prob=0.771]
Warmup: 11%|β | 138/1300 [00:01, 99.65it/s, step size=7.20e-02, acc. prob=0.768]
Warmup: 12%|ββ | 153/1300 [00:01, 112.14it/s, step size=1.41e-01, acc. prob=0.766]
Warmup: 13%|ββ | 165/1300 [00:02, 89.22it/s, step size=3.13e-01, acc. prob=0.770]
Warmup: 14%|ββ | 176/1300 [00:02, 93.63it/s, step size=2.73e-01, acc. prob=0.771]
Warmup: 14%|ββ | 187/1300 [00:02, 94.68it/s, step size=2.53e-01, acc. prob=0.772]
Warmup: 16%|ββ | 202/1300 [00:02, 106.51it/s, step size=1.69e-01, acc. prob=0.773]
Warmup: 17%|ββ | 225/1300 [00:02, 138.93it/s, step size=1.64e-01, acc. prob=0.775]
Warmup: 18%|ββ | 240/1300 [00:02, 137.54it/s, step size=2.42e-01, acc. prob=0.777]
Warmup: 20%|ββ | 255/1300 [00:02, 139.59it/s, step size=2.17e-01, acc. prob=0.777]
Warmup: 21%|ββ | 272/1300 [00:02, 146.92it/s, step size=2.15e-01, acc. prob=0.778]
Warmup: 22%|βββ | 288/1300 [00:02, 138.56it/s, step size=2.74e-01, acc. prob=0.780]
Warmup: 23%|βββ | 304/1300 [00:03, 142.68it/s, step size=1.84e-01, acc. prob=0.780]
Warmup: 25%|βββ | 321/1300 [00:03, 148.29it/s, step size=1.64e-01, acc. prob=0.780]
Warmup: 26%|βββ | 339/1300 [00:03, 155.63it/s, step size=2.54e-01, acc. prob=0.782]
Warmup: 27%|βββ | 355/1300 [00:03, 147.90it/s, step size=3.74e-01, acc. prob=0.780]
Warmup: 29%|βββ | 372/1300 [00:03, 149.72it/s, step size=3.36e-01, acc. prob=0.781]
Warmup: 30%|βββ | 390/1300 [00:03, 157.86it/s, step size=5.88e-01, acc. prob=0.782]
Sample: 32%|ββββ | 411/1300 [00:03, 170.45it/s, step size=4.13e-01, acc. prob=0.827]
Sample: 34%|ββββ | 436/1300 [00:03, 191.78it/s, step size=4.13e-01, acc. prob=0.857]
Sample: 35%|ββββ | 457/1300 [00:03, 196.81it/s, step size=4.13e-01, acc. prob=0.847]
Sample: 37%|ββββ | 478/1300 [00:04, 197.98it/s, step size=4.13e-01, acc. prob=0.869]
Sample: 39%|ββββ | 505/1300 [00:04, 218.30it/s, step size=4.13e-01, acc. prob=0.838]
Sample: 41%|ββββ | 528/1300 [00:04, 219.23it/s, step size=4.13e-01, acc. prob=0.840]
Sample: 42%|βββββ | 552/1300 [00:04, 219.52it/s, step size=4.13e-01, acc. prob=0.844]
Sample: 44%|βββββ | 578/1300 [00:04, 230.72it/s, step size=4.13e-01, acc. prob=0.847]
Sample: 47%|βββββ | 607/1300 [00:04, 247.02it/s, step size=4.13e-01, acc. prob=0.853]
Sample: 49%|βββββ | 632/1300 [00:04, 238.97it/s, step size=4.13e-01, acc. prob=0.862]
Sample: 50%|βββββ | 656/1300 [00:04, 236.18it/s, step size=4.13e-01, acc. prob=0.857]
Sample: 52%|ββββββ | 681/1300 [00:04, 238.59it/s, step size=4.13e-01, acc. prob=0.861]
Sample: 54%|ββββββ | 705/1300 [00:05, 215.21it/s, step size=4.13e-01, acc. prob=0.862]
Sample: 56%|ββββββ | 727/1300 [00:05, 212.14it/s, step size=4.13e-01, acc. prob=0.864]
Sample: 100%|██████████| 1300/1300 [00:07, 170.38it/s, step size=4.13e-01, acc. prob=0.859]
Warmup: 16%|█▌ | 251/1600 [00:01, 198.36it/s, step size=2.17e-01, acc. prob=0.779]
KeyboardInterrupt: a subsequent sampling run was interrupted manually during warmup.
hmc_sample_chains holds the sampled MCMC values as {"chain_0": {"alpha": [-0.20020795, -0.1829252, -0.18054989, ...], "beta": [...]}, "chain_1": {"alpha": [...], "beta": [...]}, ...}
4. Diagnosing model fit
Model fit diagnosis consists of obtaining core summary statistics from the sampled outputs and assessing the convergence of the various chains, before moving on to inference or evaluating the predictive power of the model.
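Convergence across chains can also be checked numerically. Below is a minimal sketch of the (non-split) Gelman-Rubin R-hat diagnostic; the helper `gelman_rubin_rhat` and the synthetic chains are illustrative stand-ins, not part of this notebook's pipeline, which stores its samples in hmc_sample_chains.

```python
import numpy as np

def gelman_rubin_rhat(chains):
    """Non-split Gelman-Rubin R-hat for a list of equal-length 1-D sample arrays."""
    chains = np.asarray(chains, dtype=float)   # shape (m chains, n samples)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)            # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()      # mean within-chain variance
    var_hat = (n - 1) / n * W + B / n          # pooled posterior variance estimate
    return np.sqrt(var_hat / W)

# Illustrative check on synthetic "chains": well-mixed chains give R-hat close to 1.
rng = np.random.default_rng(0)
mixed = [rng.normal(-0.19, 0.01, 1000) for _ in range(4)]
print(round(float(gelman_rubin_rhat(mixed)), 3))  # a value near 1.0
```

Values much above 1 (commonly, above ~1.01-1.1 depending on the convention used) indicate the chains have not mixed.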
The following builds the parameter vs. chain matrix and optionally saves the dataframe.
beta_chain_matrix_df = pd.DataFrame(hmc_sample_chains)
# beta_chain_matrix_df.to_csv("dogs_log_regression_hmc_sample_chains.csv", index=False)
beta_chain_matrix_df
| chain_0 | chain_1 | chain_2 | chain_3 | |
|---|---|---|---|---|
| alpha | [-0.20020795, -0.1829252, -0.18054989, -0.1801... | [-0.2095339, -0.19068532, -0.18717553, -0.1877... | [-0.18193315, -0.20047516, -0.19793099, -0.201... | [-0.20323507, -0.19557746, -0.2024616, -0.1950... |
| beta | [-0.004143771, -0.0073412987, -0.0047744326, -... | [-0.007274771, -0.006299746, -0.0069501437, -0... | [-0.004794343, -0.004884478, -0.009914721, -0.... | [-0.010064903, -0.0046538757, -0.0057075787, -... |
Key statistic results as a dataframe
The following lambda maps a statistic name to its computed value, given an array of sampled values.
all_metric_func_map = lambda metric, vals: {"mean":np.mean(vals), "std":np.std(vals),
"25%":np.quantile(vals, 0.25),
"50%":np.quantile(vals, 0.50),
"75%":np.quantile(vals, 0.75)}.get(metric)
The following outputs a summary of the required statistics "mean", "std", "Q(0.25)", "Q(0.50)" & "Q(0.75)" for each parameter, per chain.
key_metrics= ["mean", "std", "25%", "50%", "75%"]
summary_stats_df_= pd.DataFrame()
for metric in key_metrics:
final_di = {}
for column in beta_chain_matrix_df.columns:
params_per_column_di = dict(beta_chain_matrix_df[column].apply(lambda x: all_metric_func_map(metric, x)))
final_di[column]= params_per_column_di
metric_df_= pd.DataFrame(final_di)
metric_df_["metric"]= metric
summary_stats_df_= pd.concat([summary_stats_df_, metric_df_], axis=0)
summary_stats_df_.reset_index(inplace=True)
summary_stats_df_.rename(columns= {"index":"parameter"}, inplace=True)
summary_stats_df_.set_index(["metric", "parameter"], inplace=True)
summary_stats_df_
| chain_0 | chain_1 | chain_2 | chain_3 | ||
|---|---|---|---|---|---|
| metric | parameter |
| mean | alpha | -0.193345 | -0.193843 | -0.194168 | -0.193662 |
| beta | -0.007653 | -0.007508 | -0.007655 | -0.007687 | |
| std | alpha | 0.010443 | 0.010499 | 0.010202 | 0.010418 |
| beta | 0.002364 | 0.002513 | 0.002516 | 0.002539 | |
| 25% | alpha | -0.199481 | -0.201243 | -0.200697 | -0.200732 |
| beta | -0.009039 | -0.009009 | -0.009176 | -0.009291 | |
| 50% | alpha | -0.193267 | -0.193366 | -0.193824 | -0.193818 |
| beta | -0.007366 | -0.007287 | -0.007387 | -0.007382 | |
| 75% | alpha | -0.186467 | -0.187007 | -0.187279 | -0.186131 |
| beta | -0.006055 | -0.005658 | -0.005817 | -0.005855 |
Obtain five-point summary statistics (mean, std & the 25%, 50%, 75% quantiles) as tabular data per chain and save the dataframe.
fit_df = pd.DataFrame()
for chain, values in hmc_sample_chains.items():
param_df = pd.DataFrame(values)
param_df["chain"]= chain
fit_df= pd.concat([fit_df, param_df], axis=0)
fit_df.to_csv("data/dogs_classification_hmc_samples.csv", index=False)
fit_df
| alpha | beta | chain | |
|---|---|---|---|
| 0 | -0.200208 | -0.004144 | chain_0 |
| 1 | -0.182925 | -0.007341 | chain_0 |
| 2 | -0.180550 | -0.004774 | chain_0 |
| 3 | -0.180133 | -0.010565 | chain_0 |
| 4 | -0.180133 | -0.010565 | chain_0 |
| ... | ... | ... | ... |
| 1245 | -0.190923 | -0.009148 | chain_3 |
| 1246 | -0.186947 | -0.003398 | chain_3 |
| 1247 | -0.187828 | -0.004569 | chain_3 |
| 1248 | -0.187828 | -0.004569 | chain_3 |
| 1249 | -0.196676 | -0.014982 | chain_3 |
4250 rows × 3 columns
# Uncomment the following once the results from the Pyro sampling operation are saved offline
# fit_df= pd.read_csv("data/dogs_classification_hmc_samples.csv")
fit_df
| alpha | beta | chain | |
|---|---|---|---|
| 0 | -0.200208 | -0.004144 | chain_0 |
| 1 | -0.182925 | -0.007341 | chain_0 |
| 2 | -0.180550 | -0.004774 | chain_0 |
| 3 | -0.180133 | -0.010565 | chain_0 |
| 4 | -0.180133 | -0.010565 | chain_0 |
| ... | ... | ... | ... |
| 1245 | -0.190923 | -0.009148 | chain_3 |
| 1246 | -0.186947 | -0.003398 | chain_3 |
| 1247 | -0.187828 | -0.004569 | chain_3 |
| 1248 | -0.187828 | -0.004569 | chain_3 |
| 1249 | -0.196676 | -0.014982 | chain_3 |
4250 rows × 3 columns
The following outputs a similar summary of the required statistics "mean", "std", "Q(0.25)", "Q(0.50)" & "Q(0.75)", but in a slightly different format.
summary_stats_df_2= pd.DataFrame()
for param in ["alpha", "beta"]:
for name, groupdf in fit_df.groupby("chain"):
groupdi = dict(groupdf[param].describe())
values = dict(map(lambda key:(key, [groupdi.get(key)]), ['mean', 'std', '25%', '50%', '75%']))
values.update({"parameter": param, "chain":name})
summary_stats_df= pd.DataFrame(values)
summary_stats_df_2= pd.concat([summary_stats_df_2, summary_stats_df], axis=0)
summary_stats_df_2.set_index(["parameter", "chain"], inplace=True)
summary_stats_df_2
| mean | std | 25% | 50% | 75% | ||
|---|---|---|---|---|---|---|
| parameter | chain | |||||
| alpha | chain_0 | -0.193345 | 0.010448 | -0.199481 | -0.193267 | -0.186467 |
| chain_1 | -0.193843 | 0.010504 | -0.201243 | -0.193366 | -0.187007 | |
| chain_2 | -0.194168 | 0.010208 | -0.200697 | -0.193824 | -0.187279 | |
| chain_3 | -0.193662 | 0.010422 | -0.200732 | -0.193818 | -0.186131 | |
| beta | chain_0 | -0.007653 | 0.002365 | -0.009039 | -0.007366 | -0.006055 |
| chain_1 | -0.007508 | 0.002514 | -0.009009 | -0.007287 | -0.005658 | |
| chain_2 | -0.007655 | 0.002518 | -0.009176 | -0.007387 | -0.005817 | |
| chain_3 | -0.007687 | 0.002540 | -0.009291 | -0.007382 | -0.005855 |
The following plots the sampled parameter values as box plots, with M parameters side by side on the x axis for each of the N chains.
parameters= ["alpha", "beta"]# All parameters for given model
chains= fit_df["chain"].unique()# Number of chains sampled for given model
func_all_params_per_chain = lambda param, chain: (param, fit_df[fit_df["chain"]==chain][param].tolist())
func_all_chains_per_param = lambda chain, param: (f'{chain}', fit_df[param][fit_df["chain"]==chain].tolist())
di_all_params_per_chain = dict(map(lambda param: func_all_params_per_chain(param, "chain_0"), parameters))
di_all_chains_per_param = dict(map(lambda chain: func_all_chains_per_param(chain, "beta"), chains))
def plot_parameters_for_n_chains(chains=["chain_0"], parameters=["beta0", "beta1", "beta2", "beta3", "sigma"], plotting_cap=[4, 5], plot_interactive=False):
"""
Input
--------
chains: list of valid chain names, example - ["chain_0"].
parameters: list of valid parameters names, example -["beta0", "beta1", "beta2", "beta3", "sigma"].
plotting_cap: list of caps [chain_cap, param_cap] on the number of chains & parameters to plot, example [4, 5]
means plotting is capped at 4 chains & 5 parameters ONLY,
even if the lists of chains & parameters passed are longer.
plot_interactive: Flag for using Plotly if True, else Seaborn plots for False.
output
-------
Plots box plots for each chain from list of chains with parameters on x axis.
"""
try:
chain_cap, param_cap = plotting_cap
assert len(chains)<=chain_cap, "Cannot plot Number of chains greater than %s!"%chain_cap
assert len(parameters)<=param_cap, "Cannot plot Number of parameters greater than %s!"%param_cap
for chain in chains:
di_all_params_per_chain = dict(map(lambda param: func_all_params_per_chain(param, chain), parameters))
df_all_params_per_chain = pd.DataFrame(di_all_params_per_chain)
if df_all_params_per_chain.empty:
# raise Exception("Invalid chain number in context of model!")
print("Note: Chain number [%s] is Invalid in context of this model!"%chain)
continue
if plot_interactive:
df_all_params_per_chain= df_all_params_per_chain.unstack().reset_index(level=0)
df_all_params_per_chain.rename(columns={"level_0":"parameters", 0:"values"}, inplace=True)
fig = px.box(df_all_params_per_chain, x="parameters", y="values")
fig.update_layout(height=600, width=900, title_text=f'{chain}')
fig.show()
else:
df_all_params_per_chain.plot.box()
plt.title(f'{chain}')
except Exception as error:
if type(error) is AssertionError:
print("Note: %s"%error)
chains = np.random.choice(chains, chain_cap, replace=False)
parameters=np.random.choice(parameters, param_cap, replace=False)
plot_parameters_for_n_chains(chains, parameters)
else: print("Error: %s"%error)
Pass the list of M parameters and the list of N chains, with plot_interactive True or False to choose between Plotly and Seaborn.
# Use plot_interactive=False for Normal seaborn plots offline
plot_parameters_for_n_chains(chains=['chain_0', 'chain_1', 'chain_2', 'chain_3'], parameters=parameters, plot_interactive=True)
The following plots the joint distribution of sampled values for each pair of parameters, for all chains.
all_combination_params = list(itertools.combinations(parameters, 2))
for param_combo in all_combination_params:
param1, param2= param_combo
print("\nPyro -- %s"%(f'{param1} Vs. {param2}'))
sns.jointplot(data=fit_df, x=param1, y=param2, hue= "chain")
plt.title(f'{param1} Vs. {param2}')
plt.show()
Pyro -- alpha Vs. beta
The following plots the pairplot distribution of each parameter against every other parameter's sampled values.
sns.pairplot(data=fit_df, hue= "chain");
The following takes a list of parameters, say ["alpha", "beta"], and plots hexbins for each interaction pair, for all possible combinations of the parameters alpha & beta.
def hexbin_plot(x, y, x_label, y_label):
"""
Input
-------
x: Pandas series or list of values to plot on x axis.
y: Pandas series or list of values to plot on y axis.
x_label: variable name for the x label.
y_label: variable name for the y label.
Output
-------
Plot Hexbin correlation density plots for given values.
"""
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
min_x = min(list(x)+list(y)) - 0.1
max_x = max(list(x)+list(y)) + 0.1
ax.plot([min_x, max_x], [min_x, max_x])
ax.set_xlim([min_x, max_x])
ax.set_ylim([min_x, max_x])
ax.set_title('{} vs. {} correlation scatterplot'.format(x_label, y_label))
hbin= ax.hexbin(x, y, gridsize=25, mincnt=1, cmap=plt.cm.Reds)
cb = fig.colorbar(hbin, ax=ax)
cb.set_label('occurrence_density')
plt.ylabel(y_label)
plt.xlabel(x_label)
plt.show()
def plot_interaction_hexbins(fit_df, parameters=["alpha", "beta"]):
"""
Input
-------
fit_df: Pandas dataframe containing sampled values across columns with parameter names as column headers
parameters: List of parameters for which all combination of hexbins are to be plotted, defaults to ["alpha", "beta"]
Output
-------
Plots hexbin correlation density plots for each pair of parameter combination.
"""
all_combination_params = list(itertools.combinations(parameters, 2))
for param1, param2 in all_combination_params:#Plots interaction between each of two parameters
hexbin_plot(fit_df[param1], fit_df[param2], param1, param2)
Here the parameters ["alpha", "beta"] are passed to plot hexbin plots for all possible interaction pairs between them.
plot_interaction_hexbins(fit_df, parameters=parameters)
5. Model evaluation: Posterior predictive checks
Posterior predictive checking helps examine the fit of a model to real data, since the parameters used for simulating conditions & regions of interest are drawn from the posterior distribution.
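As a concrete sketch of the idea, one can push posterior draws of \(\alpha\) & \(\beta\) through the likelihood \(\pi_j = e^{\alpha x_a + \beta x_s}\) and simulate replicated outcomes. The arrays below are synthetic stand-ins for the notebook's posterior samples and for x_avoidance/x_shocked, used only to illustrate the mechanics:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for posterior draws (shaped like one chain's samples).
alpha_draws = rng.normal(-0.194, 0.010, 500)
beta_draws = rng.normal(-0.0077, 0.0025, 500)

# Toy trajectory for one dog over 25 trials: it avoids every time.
x_a = np.arange(25)    # avoidances before trial j
x_s = np.zeros(25)     # shocks before trial j

# pi_j = exp(alpha * x_a + beta * x_s) for every posterior draw, then y_rep ~ Bernoulli(pi_j).
pi = np.exp(alpha_draws[:, None] * x_a + beta_draws[:, None] * x_s)  # shape (500, 25)
y_rep = rng.binomial(1, pi)   # replicated shock outcomes, one row per posterior draw

# Posterior predictive mean probability of shock over the first few trials.
print(np.round(pi.mean(axis=0)[:5], 3))
```

Comparing summaries of y_rep against the actually observed y is the essence of the checks carried out below.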
Pick samples from one particular chain of HMC samples, say chain_3.
for chain, samples in hmc_sample_chains.items():
samples= dict(map(lambda param: (param, torch.tensor(samples.get(param))), samples.keys()))# np array to tensors
print(chain, "Sample count: ", len(samples["alpha"]))
chain_0 Sample count: 1000
chain_1 Sample count: 1100
chain_2 Sample count: 900
chain_3 Sample count: 1250
Plot density for parameters from chain_3 to visualise the spread of sample values from that chain
title= "parameter distribution for : %s"%(chain)
fig = ff.create_distplot(list(map(lambda x:x.numpy(), samples.values())), list(samples.keys()))
fig.update_layout(title=title, xaxis_title="parameter values", yaxis_title="density", legend_title="parameters")
fig.show()
print("Alpha Q(0.5) :%s | Beta Q(0.5) :%s"%(torch.quantile(samples["alpha"], 0.5), torch.quantile(samples["beta"], 0.5)))
Alpha Q(0.5) :tensor(-0.1938) | Beta Q(0.5) :tensor(-0.0074)
Plot density & contours for both parameters from chain_3 to visualise the joint distribution & region of interest
fit_df = pd.DataFrame()
for chain, values in hmc_sample_chains.items():
param_df = pd.DataFrame(values)
param_df["chain"]= chain
fit_df= pd.concat([fit_df, param_df], axis=0)
#Choosing samples from chain 3
chain_samples_df= fit_df[fit_df["chain"]==chain].copy()# chain is 'chain_3'
alpha= chain_samples_df["alpha"].tolist()
beta= chain_samples_df["beta"].tolist()
colorscale = ['#7A4579', '#D56073', 'rgb(236,158,105)', (1, 1, 0.2), (0.98,0.98,0.98)]
fig = ff.create_2d_density(alpha, beta, colorscale=colorscale, hist_color='rgb(255, 255, 150)', point_size=4, title= "alpha beta joint density plot")
fig.update_layout( xaxis_title="x (alpha)", yaxis_title="y (beta)")
fig.show()
Note: The distribution of alpha values is offset well to the left of the beta values, being roughly 25 times larger in magnitude; thus for any given observation of avoidances or shocks, the likelihood of getting shocked is influenced far more by a single avoidance than by a single shock.
Observations:
On observing the spread of alpha & beta values, the parameter beta, being less negative & closer to zero, can be interpreted as learning ability, i.e., the ability of a dog to learn from shock experiences. Each additional shock lowers the probability of getting shocked (the value of \(\pi_j\)) only slightly; unless the number of trials & shocks grows considerably, \(\pi_j\) barely comes down and mostly stays around 0.9.
That is not the case with alpha: alpha is more negative & farthest from zero. It imparts a significant decline in the probability of getting shocked (\(\pi_j\)) even for the few instances where a dog avoids the shock; therefore alpha can be interpreted as retention ability, i.e., the ability to retain the learning from previous shock experiences.
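The asymmetry described above is easy to verify numerically. A minimal sketch, plugging in the approximate posterior medians reported for this chain (alpha ≈ -0.194, beta ≈ -0.0074); the helper `pi_j` is illustrative, not part of the notebook:

```python
import numpy as np

alpha, beta = -0.194, -0.0074  # approximate posterior medians from chain_3

def pi_j(x_avoided, x_shocked):
    """Probability of getting shocked at trial j, given prior avoidance & shock counts."""
    return float(np.exp(alpha * x_avoided + beta * x_shocked))

print(round(pi_j(0, 0), 3))  # no experience yet -> 1.0
print(round(pi_j(0, 1), 3))  # one prior shock barely moves pi_j -> 0.993
print(round(pi_j(1, 0), 3))  # one prior avoidance lowers pi_j noticeably -> 0.824
```

A single avoidance drops the shock probability by about 18 percentage points, while a single shock drops it by less than 1.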
print(chain_samples_df["alpha"].describe(),"\n\n", chain_samples_df["beta"].describe())
count 1250.000000
mean -0.193662
std 0.010422
min -0.230810
25% -0.200732
50% -0.193818
75% -0.186131
max -0.162166
Name: alpha, dtype: float64
count 1250.000000
mean -0.007687
std 0.002540
min -0.016617
25% -0.009291
50% -0.007382
75% -0.005855
max -0.001585
Name: beta, dtype: float64
From the contour plot above, the following region of the posterior distribution seems highly plausible for the parameters:
For alpha, -0.2 < alpha < -0.19
For beta, -0.0075 < beta < -0.0055
The following selects all the pairs of alpha, beta values within the range mentioned above.
select_sample_df= chain_samples_df[(chain_samples_df["alpha"]<-0.19)&(chain_samples_df["alpha"]>-0.2)&(chain_samples_df["beta"]>-0.0075)&(chain_samples_df["beta"]<-0.0055)]
# print(select_sample_df.set_index(["alpha", "beta"]).index)
print("Count of alpha-beta pairs of interest, from the mid region with high density in the contour plot above (-0.2 < alpha < -0.19, -0.0075 < beta < -0.0055): ", select_sample_df.shape[0])
select_sample_df.head(3)
Count of alpha-beta pairs of interest, from the mid region with high density in the contour plot above (-0.2 < alpha < -0.19, -0.0075 < beta < -0.0055): 201
| alpha | beta | chain | |
|---|---|---|---|
| 3 | -0.195004 | -0.009725 | chain_3 |
| 4 | -0.194500 | -0.008064 | chain_3 |
| 39 | -0.192702 | -0.007621 | chain_3 |
Picking a case of 3 trials with Y = [0,1,1], i.e., the dog is shocked in the 1st trial, then avoids in the 2nd & thereafter, effectively having an experience of 1 shock & 1 avoidance. Considering all values of alpha & beta in the range -0.2 < alpha < -0.19, -0.0075 < beta < -0.0055.
Y_li= []
Y_val_to_param_dict= defaultdict(list)
# Value -0.2 < alpha < -0.19, -0.0075 < beta < -0.0055
for rec in select_sample_df.iterrows():# for -0.2 < alpha < -0.19, -0.0075 < beta < -0.0055
a,b = float(rec[1]["alpha"]), float(rec[1]["beta"])
res= round(np.exp(a+b), 4)
Y_li.append(res)
Y_val_to_param_dict[res].append((round(a,5),round(b,5)))# Sample-- {0.8047: [(-0.18269378, -0.034562342), (-0.18383412, -0.033494473)], 0.8027: [(-0.18709463, -0.03263992), (-0.18464606, -0.035114493)]}
In the above, Y_val_to_param_dict is a dictionary that holds the value \(e^{\alpha+\beta}\) as key and a list of corresponding \((\alpha, \beta)\) tuples as value.
The following plots the histogram of \(e^{\alpha+\beta}\) values obtained as an interaction of the selected \(\alpha\) and \(\beta\) values from the region of interest.
Y_for_select_sample_df = pd.DataFrame({"Y_for -0.2 < alpha < -0.19 & -0.0075 < beta < -0.0055": Y_li})
fig = px.histogram(Y_for_select_sample_df, x= "Y_for -0.2 < alpha < -0.19 & -0.0075 < beta < -0.0055")
title= "observed values distribution for params Y_for -0.2 < alpha < -0.19 & -0.0075 < beta < -0.0055"
fig.update_layout(title=title, xaxis_title="observed values", yaxis_title="count", legend_title="dogs")
fig.show()
print("Mean: %s | Median: %s"%(np.mean(Y_li), np.quantile(Y_li, 0.5)))
print("Sorted observed values: \n", sorted(Y_li))
Mean: 0.8144228855721394 | Median: 0.8143
Sorted observed values:
[0.8073, 0.8074, 0.8076, 0.8089, 0.8092, 0.8092, 0.8094, 0.8096, 0.8096, 0.8101, 0.8101, 0.8102, 0.8103, 0.8103, 0.8103, 0.8104, 0.8106, 0.8107, 0.8107, 0.8107, 0.8108, 0.8109, 0.8109, 0.811, 0.8112, 0.8112, 0.8113, 0.8115, 0.8115, 0.8116, 0.8118, 0.8118, 0.8119, 0.8119, 0.812, 0.812, 0.8121, 0.8121, 0.8121, 0.8122, 0.8122, 0.8122, 0.8122, 0.8122, 0.8123, 0.8123, 0.8124, 0.8124, 0.8124, 0.8124, 0.8124, 0.8125, 0.8126, 0.8126, 0.8126, 0.8128, 0.8128, 0.8128, 0.8129, 0.8129, 0.813, 0.8131, 0.8131, 0.8131, 0.8131, 0.8132, 0.8133, 0.8133, 0.8133, 0.8133, 0.8133, 0.8133, 0.8134, 0.8134, 0.8134, 0.8134, 0.8134, 0.8135, 0.8135, 0.8136, 0.8136, 0.8136, 0.8136, 0.8137, 0.8138, 0.8139, 0.8139, 0.814, 0.814, 0.814, 0.814, 0.814, 0.814, 0.8141, 0.8141, 0.8141, 0.8141, 0.8142, 0.8142, 0.8143, 0.8143, 0.8143, 0.8144, 0.8144, 0.8144, 0.8145, 0.8145, 0.8147, 0.8147, 0.8147, 0.8148, 0.8149, 0.8149, 0.815, 0.815, 0.815, 0.815, 0.815, 0.8151, 0.8151, 0.8151, 0.8151, 0.8152, 0.8152, 0.8152, 0.8153, 0.8153, 0.8154, 0.8154, 0.8156, 0.8156, 0.8157, 0.8157, 0.8157, 0.8157, 0.8158, 0.8159, 0.816, 0.816, 0.8161, 0.8162, 0.8162, 0.8162, 0.8163, 0.8163, 0.8163, 0.8163, 0.8164, 0.8165, 0.8165, 0.8165, 0.8166, 0.8166, 0.8166, 0.8166, 0.8166, 0.8167, 0.8167, 0.8168, 0.8169, 0.8169, 0.8169, 0.817, 0.817, 0.8171, 0.8171, 0.8172, 0.8172, 0.8172, 0.8172, 0.8173, 0.8173, 0.8174, 0.8174, 0.8174, 0.8175, 0.8175, 0.8175, 0.8177, 0.8178, 0.8179, 0.8181, 0.8182, 0.8182, 0.8183, 0.8185, 0.8187, 0.8187, 0.8189, 0.8189, 0.819, 0.8192, 0.8192, 0.8192, 0.8193, 0.8193, 0.8196, 0.8196, 0.8197, 0.8197, 0.8202]
For the given experiment of 3 trials, from all the Ys with corresponding alpha-beta pairs of interest, pick for instance the 3 lowest values of Y, thus selecting their corresponding alpha-beta pairs.
Note: Multiple observed values from the histogram can be added for comparison.
Corresponding to the lowest_obs values of Y, obtain selected_pairs as the list of corresponding alpha, beta pairs from Y_val_to_param_dict.
lowest_obs = sorted(Y_li)[:3]#[0.8085, 0.8094, 0.8095]# Pick values from above histogram range or sorted list
selected_pairs= list(itertools.chain.from_iterable(map(lambda obs: Y_val_to_param_dict.get(obs), lowest_obs)))
selected_pairs
[(-0.19852, -0.0156), (-0.19746, -0.01654), (-0.19804, -0.01568)]
Following stores a dictionary of observed y values for pair of alpha-beta parameters
def get_obs_y_dict(select_pairs, x_a, x_s):
"""
Input
-------
select_pairs: pairs of (alpha, beta) selected
x_a: array holding avoidance count for all dogs & all trials, example for 30 dogs & 25 trials, shaped (30, 25)
x_s: array holding shock count for all dogs & all trials, example for 30 dogs & 25 trials, shaped (30, 25)
Output
-------
Outputs a dictionary with a tuple of (alpha, beta) as key & the observed values of y corresponding to that alpha, beta pair as value
"""
y_dict = {}
for alpha, beta in select_pairs:# pair of alpha, beta
y_dict[(alpha, beta)] = torch.exp(alpha*x_a + beta* x_s)
return y_dict
obs_y_dict= get_obs_y_dict(selected_pairs, x_avoidance, x_shocked)
print("Alpha-beta pair values as Keys to access corresponding array of inferred observations: \n", list(obs_y_dict.keys()))
Alpha-beta pair values as Keys to access corresponding array of inferred observations:
[(-0.19852, -0.0156), (-0.19746, -0.01654), (-0.19804, -0.01568)]
Following plots scatterplots of observed y values for all 30 dogs for each alpha-beta pair of interest
def plot_observed_y_given_parameters(observations_list, selected_pairs_list, observed_y, chain, original_obs= []):
"""
Input
-------
observations_list: list of observed 'y' values from the simulated 3-trial experiment, computed corresponding
to the selected pairs of (alpha, beta)
selected_pairs_list: list of alpha, beta pair tuples, example : [(-0.225, -0.01272), (-0.21844, -0.01442)]
observed_y: dict holding observed values corresponding to a pair of alpha, beta tuple as key,
example: {(-0.225, -0.01272): tensor([[1.0000, 0.9874,..]])}
chain: name of the chain from sampler
original_obs: original observations/ labels from given data
returns plotly scatter plots with number of trials on X axis & corresponding probability of getting
shocked for each pair of (alpha, beta) passed in 'selected_pairs_list'.
Output
--------
Plots scatter plot of all observed values of y corresponding to each given pair of alpha, beta
"""
obs_column_names = [f'Dog_{ind+1}'for ind in range(dogs_data["Ndogs"])]
for record in zip(observations_list, selected_pairs_list):
sim_y, select_pair = record
print("\nFor simulated y value: %s & Selected pair: %s"%(sim_y, select_pair))
obs_y_df= pd.DataFrame(observed_y[select_pair].numpy().T, columns=obs_column_names)
if original_obs is not plot_observed_y_given_parameters.__defaults__[0]:
original_obs_column_names = list(map(lambda name:f'*{name}', obs_column_names))
original_obs_df= pd.DataFrame(original_obs.numpy().T, columns=original_obs_column_names)
obs_y_df= pd.concat([obs_y_df, original_obs_df], axis=1)
print("Note: Legend *Dog_X corresponds to 'y' i.e.,original observation values")
obs_y_title= "Observed values distribution for all dogs given parameter %s from %s"%(select_pair, chain)
fig = px.scatter(obs_y_df, title=obs_y_title)
fig.update_layout(title=obs_y_title, xaxis_title="Trials", yaxis_title="Probability of shock at trial j (π_j)", legend_title="Dog identifier")
fig.show()
Optionally, also pass the true observed y values for all 30 dogs as the original_obs argument, to plot them alongside the observed y from the alpha-beta pairs of interest.
Note: True observed y values are marked with legends in the format *Dog_X.
plot_observed_y_given_parameters(lowest_obs, selected_pairs, obs_y_dict, chain, original_obs= y)
For simulated y value: 0.8073 & Selected pair: (-0.19852, -0.0156)
Note: Legend *Dog_X corresponds to 'y' i.e.,original observation values
For simulated y value: 0.8074 & Selected pair: (-0.19746, -0.01654)
Note: Legend *Dog_X corresponds to 'y' i.e.,original observation values
For simulated y value: 0.8076 & Selected pair: (-0.19804, -0.01568)
Note: Legend *Dog_X corresponds to 'y' i.e.,original observation values
The following plots a single scatterplot comparing the observed y values for all alpha-beta pairs of interest from the dense region in the contour plot above, that is -0.2 < alpha < -0.19, -0.0075 < beta < -0.0055.
def compare_dogs_given_parameters(pairs_to_compare, observed_y, original_obs=[], alpha_by_beta_dict= {}):
"""
Input
--------
pairs_to_compare: list of alpha, beta pair tuples to compare,
example : [(-0.225, -0.0127), (-0.218, -0.0144)]
observed_y: dict holding observed values corresponding to a pair of alpha,
beta tuple as key, example: {(-0.225, -0.01272): tensor([[1.0000, 0.9874,..]])}
original_obs: original observations/labels from the given data (optional)
alpha_by_beta_dict: holds alpha, beta pair tuples as keys & alpha/beta as value, example:
{(-0.2010, -0.0018): 107.08}
Output
--------
returns a plotly scatter plot with number of trials on X axis & corresponding probability of getting
shocked for each pair of (alpha, beta) passed for comparison.
"""
combined_pairs_obs_df= pd.DataFrame()
title_txt = ""
additional_txt = ""
obs_column_names = [f'Dog_{ind+1}'for ind in range(dogs_data["Ndogs"])]
for i, select_pair in enumerate(pairs_to_compare):
i+=1
title_txt+=f'Dog_X_m_{i} corresponds to {select_pair}, '
obs_column_names_model_x =list(map(lambda name:f'{name}_m_{i}', obs_column_names))
if alpha_by_beta_dict:
additional_txt+=f'α/β for Dog_X_m_{i} {round(alpha_by_beta_dict.get(select_pair), 2)}, '
obs_y_df= pd.DataFrame(observed_y[select_pair].numpy().T, columns=obs_column_names_model_x)
combined_pairs_obs_df= pd.concat([combined_pairs_obs_df, obs_y_df], axis=1)
print(title_txt)
print("\n%s"%additional_txt)
if len(original_obs):# true observations passed for side-by-side plotting
original_obs_column_names = list(map(lambda name:f'*{name}', obs_column_names))
original_obs_df= pd.DataFrame(original_obs.numpy().T, columns=original_obs_column_names)
combined_pairs_obs_df= pd.concat([combined_pairs_obs_df, original_obs_df], axis=1)
print("\nNote: Legend *Dog_X corresponds to 'y', i.e., original observation values")
obs_y_title= "Observed values for all dogs given parameter for a chain"
fig = px.scatter(combined_pairs_obs_df, title=obs_y_title)
fig.update_layout(title=obs_y_title, xaxis_title="Trials", yaxis_title="Probability of shock at trial j (π_j)", legend_title="Dog identifier")
fig.show()
Optionally, pass the true observed y values for all 30 dogs to the original_obs argument, to plot them alongside the simulated y from the alpha-beta pairs of interest.
Note: true observed y values are marked with legends of the form *Dog_X
compare_dogs_given_parameters(selected_pairs, obs_y_dict, original_obs= y)
Dog_X_m_1 corresponds to (-0.19852, -0.0156), Dog_X_m_2 corresponds to (-0.19746, -0.01654), Dog_X_m_3 corresponds to (-0.19804, -0.01568),
Note: Legend *Dog_X corresponds to 'y', i.e., original observation values
Observations: The 3 individual scatter plots above correspond to the 3 most optimal alpha-beta pairs from the 3rd quadrant of the contour plot drawn earlier; the scatter plot following them facilitates comparing observed y values for all 3 pairs at once:
Data for almost all dogs in the experiment favours the m1 parameters (-0.19852, -0.0156) over m3 & m2, with the exception of Dogs 6 & 7, which show affinity for the m3 parameters (-0.19804, -0.01568) over m2 & m1 across all 30 trials.
Plotting observed values y corresponding to the alpha-beta pairs with the mean, minimum & maximum value of \(\frac{\alpha}{\beta}\)
The following computes \(\frac{\alpha}{\beta}\) for each pair of alpha, beta and outputs the pairs yielding the mean, maximum & minimum values; these can then be marked on a single scatter plot for comparison of observed y values across all alpha-beta pairs of interest.
def get_alpha_by_beta_records(chain_df, metrics=["max", "min", "mean"]):
"""
Input
--------
chain_df: dataframe holding sampled parameters for a given chain
metrics: list of statistics to report over the alpha_by_beta column, any of those computed by describe()
Output
-------
Returns an alpha_by_beta dictionary with (alpha, beta) pair tuples as keys & alpha/beta as value,
example: {(-0.2010, -0.0018): 107.08}; one entry per statistic/metric requested in the metrics
argument, computed over the alpha_by_beta column of the passed dataframe.
"""
alpha_beta_dict= {}
chain_df["alpha_by_beta"] = chain_df["alpha"]/chain_df["beta"]
min_max_values = dict(chain_df["alpha_by_beta"].describe())
alpha_beta_list= list(map(lambda key: chain_df[chain_df["alpha_by_beta"]<=min_max_values.get(key)].sort_values(["alpha_by_beta"]).iloc[[-1]].set_index(["alpha", "beta"])["alpha_by_beta"].to_dict(), metrics))
[alpha_beta_dict.update(element) for element in alpha_beta_list];
return alpha_beta_dict
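The selection logic above can be sketched on a toy chain dataframe; all sampled values below are hypothetical, not from the experiment. For each requested statistic, it keeps the sampled pair whose alpha/beta ratio is the largest one not exceeding that statistic:

```python
import pandas as pd

# Toy chain of sampled (alpha, beta) pairs -- hypothetical values
chain_df = pd.DataFrame({
    "alpha": [-0.20, -0.19, -0.18],
    "beta":  [-0.002, -0.016, -0.006],
})
chain_df["alpha_by_beta"] = chain_df["alpha"] / chain_df["beta"]

stats = chain_df["alpha_by_beta"].describe()  # includes 'min', 'max', 'mean'
records = {}
for metric in ["max", "min", "mean"]:
    # largest ratio still <= the requested statistic, mirroring get_alpha_by_beta_records
    row = (chain_df[chain_df["alpha_by_beta"] <= stats[metric]]
           .sort_values("alpha_by_beta")
           .iloc[-1])
    records[(row["alpha"], row["beta"])] = row["alpha_by_beta"]

print(records)
```

Note that for "mean" this picks an actually sampled pair closest to (but not above) the mean ratio, rather than the mean itself, so the returned key is always a real posterior draw.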
alpha_by_beta_dict = get_alpha_by_beta_records(chain_samples_df, metrics=["max", "min", "mean"])# outputs a dict of type {(-0.2010, -0.0018): 107.08}
print("Alpha-beta pair with value as alpha/beta: ", alpha_by_beta_dict)
alpha_by_beta_selected_pairs= list(alpha_by_beta_dict.keys())
alpha_by_beta_obs_y_dict = get_obs_y_dict(alpha_by_beta_selected_pairs, x_avoidance, x_shocked)# Outputs observed_values for given (alpha, beta)
Alpha-beta pair with value as alpha/beta: {(-0.1841791868209839, -0.0015849750488996506): 116.20320892333984, (-0.19745661318302155, -0.016540180891752243): 11.937995910644531, (-0.18423298001289368, -0.006490856874734163): 28.383460998535156}
Following is the scatter plot for observed y values corresponding to the pairs of alpha, beta yielding the minimum, maximum & mean value of \(\frac{\alpha}{\beta}\).
Note: The y i.e., original observations are simultaneously plotted side by side.
compare_dogs_given_parameters(alpha_by_beta_selected_pairs, alpha_by_beta_obs_y_dict,
original_obs= y, alpha_by_beta_dict= alpha_by_beta_dict)
Dog_X_m_1 corresponds to (-0.1841791868209839, -0.0015849750488996506), Dog_X_m_2 corresponds to (-0.19745661318302155, -0.016540180891752243), Dog_X_m_3 corresponds to (-0.18423298001289368, -0.006490856874734163),
α/β for Dog_X_m_1 116.2, α/β for Dog_X_m_2 11.94, α/β for Dog_X_m_3 28.38,
Note: Legend *Dog_X corresponds to 'y', i.e., original observation values
Observations: The scatter plot above corresponds to 3 pairs of alpha-beta values from the contour plot drawn earlier, yielding the maximum, minimum & mean values of α/β. The plot facilitates comparing observed y values for all pairs against the true observed y at once:
1. Data for the first 7 dogs in the experiment favours the m1 parameters (-0.184, -0.0015), with the highest α/β around 116, followed by m3 & m2 across all 30 trials. Avoidance learning in these 7 dogs is captured suitably by model 2, but most of the instances in which they are shocked are modelled well by the m1 parameters.
2. Data for the remaining 23 dogs shows affinity for the m2 parameters (-0.197, -0.016), with the lowest α/β around 11, followed by m3 & m1 across all 30 trials; likewise, avoidance learning in these 23 dogs is captured suitably by model 2, but most of the instances in which they are shocked are modelled well only by the m1 parameters.
3. Data for Dogs 18-20 fits model 2 increasingly well after the 10th trial, whereas for Dogs 21-30 the model 2 parameters fit the original data exceptionally well after only the 6th trial.
6. Model Comparison¶
Compare Dogs model with Normal prior & Uniform prior using Deviance Information Criterion (DIC)
DIC is computed as follows:
\(D(\alpha,\beta) = -2\ \sum_{i=1}^{n} \log P\ (y_{i} \mid \alpha,\beta)\)
\(\log P\ (y_{i} \mid \alpha,\beta)\) is the log likelihood of the shocks/avoidances observed given parameters \(\alpha,\beta\); this expression expands as follows:
\(D(\alpha,\beta) = -2\ \sum_{i=1}^{30}[ y_{i}\ (\alpha Xa_{i}\ +\beta\ Xs_{i}) + \ (1-y_{i})\log\ (1\ -\ e^{(\alpha Xa_{i}\ +\beta\ Xs_{i})})]\)
Using \(D(\alpha,\beta)\) to Compute DIC¶
\(\overline D(\alpha,\beta) = \frac{1}{T} \sum_{t=1}^{T} D(\alpha_{t},\beta_{t})\)
\(\overline \alpha = \frac{1}{T} \sum_{t=1}^{T}\alpha_{t}\) \(\overline \beta = \frac{1}{T} \sum_{t=1}^{T}\beta_{t}\)
\(D(\overline\alpha,\overline\beta) = -2\ \sum_{i=1}^{30}[ y_{i}\ (\overline\alpha Xa_{i}\ +\overline\beta\ Xs_{i}) + \ (1-y_{i})\log\ (1\ -\ e^{(\overline\alpha Xa_{i}\ +\overline\beta\ Xs_{i})})]\)
Therefore, finally: \( DIC\ =\ 2\ \overline D(\alpha,\beta)\ -\ D(\overline\alpha,\overline\beta) \)
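As a concrete illustration, the DIC recipe can be sketched end-to-end on toy data; all counts and posterior draws below are hypothetical, not from the experiment:

```python
import torch

def deviance(alpha, beta, xa, xs, y):
    # D(alpha, beta) = -2 * sum_i [ y_i*(alpha*Xa_i + beta*Xs_i)
    #                             + (1 - y_i)*log(1 - exp(alpha*Xa_i + beta*Xs_i)) ]
    p = torch.clamp(alpha * xa + beta * xs, max=-1e-4)  # log-prob of shock, kept < 0
    log_lik = y * p + (1 - y) * torch.log(1 - torch.exp(p))
    return -2 * log_lik.sum().item()

# toy data: 2 dogs, 3 trials (hypothetical counts)
xa = torch.tensor([[0., 1., 2.], [0., 0., 1.]])   # prior avoidances
xs = torch.tensor([[0., 0., 0.], [1., 1., 1.]])   # prior shocks
y  = torch.tensor([[1., 0., 0.], [1., 1., 0.]])   # 1 = shocked at that trial

# toy posterior draws (alpha_t, beta_t), T = 2
alphas = torch.tensor([-0.20, -0.18])
betas  = torch.tensor([-0.012, -0.016])

D_bar     = sum(deviance(a, b, xa, xs, y) for a, b in zip(alphas, betas)) / len(alphas)
D_at_mean = deviance(alphas.mean(), betas.mean(), xa, xs, y)
DIC       = 2 * D_bar - D_at_mean
print(round(DIC, 3))
```

Since the log likelihood is concave in \((\alpha, \beta)\) here, \(D(\overline\alpha,\overline\beta) \le \overline D(\alpha,\beta)\), so the effective-parameter penalty \(\overline D - D(\overline\alpha,\overline\beta)\) is non-negative.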
The following method computes the deviance value given parameters alpha & beta.
def calculate_deviance_given_param(parameters, x_avoidance, x_shocked, y):
"""
Input
-------
parameters : dictionary containing sampled values of parameters alpha & beta
x_avoidance: tensor holding avoidance count for all dogs & all trials, example for
30 dogs & 25 trials, shaped (30, 25)
x_shocked: tensor holding shock count for all dogs & all trials, example for 30 dogs
& 25 trials, shaped (30, 25).
y: tensor holding response for all dogs & trials, example for 30 dogs
& 25 trials, shaped (30, 25).
Output
-------
Computes deviance as D(Bt)
D(Bt) : Summation of log likelihood / conditional probability of output,
given param 'Bt' over all the 'n' cases.
Returns deviance value for a pair of parameters, alpha & beta.
"""
D_bt_ = []
p = parameters["alpha"]*x_avoidance + parameters["beta"]*x_shocked# alpha * Xai + beta * Xsi
p=p.double()
p= torch.where(p<-0.0001, p, -0.0001).float()# clip log-probability to at most -1e-4, so log(1 - exp(p)) below stays finite
Pij_vec = p.flatten().unsqueeze(1)# shapes (750, 1)
Yij_vec= y.flatten().unsqueeze(0)# shapes (1, 750)
# D_bt = -2 * Summation_over_i-30 (yi.(alpha.Xai + beta.Xsi)+ (1-yi).log (1- e^(alpha.Xai + beta.Xsi)))
D_bt= torch.mm(Yij_vec, Pij_vec) + torch.mm(1-Yij_vec, torch.log(1- torch.exp(Pij_vec)))
D_bt= -2*D_bt.squeeze().item()
return D_bt
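The matrix-product formulation used inside calculate_deviance_given_param can be sanity-checked against a direct elementwise sum; the toy tensors below are hypothetical, with strictly negative log-probabilities as the clipping step guarantees:

```python
import torch

torch.manual_seed(0)
# toy log-probabilities (strictly negative) and binary responses, shaped (4, 5)
p = -torch.rand(4, 5) - 1e-3
y = torch.randint(0, 2, (4, 5)).float()

# matrix-product form, as in calculate_deviance_given_param
P = p.flatten().unsqueeze(1)   # shapes (20, 1)
Y = y.flatten().unsqueeze(0)   # shapes (1, 20)
D_mm = -2 * (torch.mm(Y, P) + torch.mm(1 - Y, torch.log(1 - torch.exp(P)))).item()

# direct elementwise form of the same summation
D_sum = -2 * (y * p + (1 - y) * torch.log(1 - torch.exp(p))).sum().item()

print(D_mm, D_sum)
```

Both forms compute the same \(-2 \sum_{ij} [y_{ij} p_{ij} + (1-y_{ij})\log(1-e^{p_{ij}})]\); the mm version simply folds the sum into two inner products.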
def calculate_mean_deviance(samples, x_avoidance, x_shocked, y):
"""
Input
-------
samples : dictionary containing mean of sampled values of parameters alpha & beta.
x_avoidance: tensor holding avoidance count for all dogs & all trials, example for
30 dogs & 25 trials, shaped (30, 25).
x_shocked: tensor holding shock count for all dogs & all trials, example for 30 dogs
& 25 trials, shaped (30, 25).
y: tensor holding response for all dogs & trials, example for 30 dogs
& 25 trials, shaped (30, 25).
Output
-------
Computes mean deviance as D(Bt)_bar
D(Bt)_bar: Average of D(Bt) values calculated for each
Bt (Bt is a single param value from chain of samples)
Returns mean deviance value for a pair of parameters, alpha & beta.
"""
samples_count = list(samples.values())[0].size()[0]
all_D_Bts= []
for index in range(samples_count):# pair of alpha, beta
samples_= dict(map(lambda param: (param, samples.get(param)[index]), samples.keys()))
D_Bt= calculate_deviance_given_param(samples_, x_avoidance, x_shocked, y)
all_D_Bts.append(D_Bt)
D_Bt_mean = torch.mean(torch.tensor(all_D_Bts))
D_Bt_mean =D_Bt_mean.squeeze().item()
return D_Bt_mean
The following method computes the Deviance Information Criterion for a given Bayesian model & chains of sampled parameters alpha & beta.
def DIC(sample_chains, x_avoidance, x_shocked, y):
"""
Input
-------
sample_chains : dictionary containing multiple chains of sampled values, with chain name as
key and sampled values of parameters alpha & beta.
x_avoidance: tensor holding avoidance count for all dogs & all trials, example for
30 dogs & 25 trials, shaped (30, 25).
x_shocked: tensor holding shock count for all dogs & all trials, example for 30 dogs
& 25 trials, shaped (30, 25).
y: tensor holding response for all dogs & trials, example for 30 dogs
& 25 trials, shaped (30, 25).
Output
-------
Computes DIC as follows:
D_mean_parameters: D(α_bar, β_bar), deviance computed with each parameter α, β averaged
over 's' samples, across all the 'n' cases.
D_Bt_mean: D(α, β)_bar, average of the deviances computed for each sampled α, β pair,
across all the 'n' cases.
DIC is computed as DIC = 2 D(α, β)_bar - D(α_bar, β_bar)
returns Deviance Information Criterion for a chain of alpha & beta sampled values.
"""
dic_list= []
for chain, samples in sample_chains.items():
samples= dict(map(lambda param: (param, torch.tensor(samples.get(param))), samples.keys()))# np array to tensors
mean_parameters = dict(map(lambda param: (param, torch.mean(samples.get(param))), samples.keys()))
D_mean_parameters = calculate_deviance_given_param(mean_parameters, x_avoidance, x_shocked, y)
D_Bt_mean = calculate_mean_deviance(samples, x_avoidance, x_shocked, y)
dic = round(2* D_Bt_mean - D_mean_parameters,3)
dic_list.append(dic)
print(". . .DIC for %s: %s"%(chain, dic))
print("\n. .Mean Deviance information criterion for all chains: %s\n"%(round(np.mean(dic_list), 3)))
def compare_DICs_given_model(x_avoidance, x_shocked, y, **kwargs):
"""
Input
--------
x_avoidance: tensor holding avoidance count for all dogs & all trials, example for
30 dogs & 25 trials, shaped (30, 25).
x_shocked: tensor holding shock count for all dogs & all trials, example for 30 dogs
& 25 trials, shaped (30, 25).
y: tensor holding response for all dogs & trials, example for 30 dogs
& 25 trials, shaped (30, 25).
kwargs: dict of type {"model_name": sample_chains_dict}
Output
--------
Compares Deviance Information Criterion values for multiple Bayesian models.
"""
for model_name, sample_chains in kwargs.items():
print("%s\n\nFor model : %s"%("_"*30, model_name))
DIC(sample_chains, x_avoidance, x_shocked, y)
Define an alternate model with a different prior, such as a uniform distribution
The following model is defined in Pyro in the same manner, per the same expression of the generative model for this dataset, with the only modification being a Uniform rather than a Normal prior distribution:
\(\pi_j\) ~ \(bern\ (\exp \ (\alpha.XAvoidance + \beta.XShocked)\ )\), \(prior\ \alpha\) ~ \(U(-10, 0)\), \(\beta\) ~ \(U(-10, 0)\)
Note: the implementation below samples \(\alpha, \beta\) from Uniform(-10, -0.00001), i.e., strictly negative values, so that \(\exp \ (\alpha.XAvoidance + \beta.XShocked)\) remains a valid Bernoulli probability.
# Dogs model with uniform prior
def DogsModelUniformPrior(x_avoidance, x_shocked, y):
"""
Input
-------
x_avoidance: tensor holding avoidance count for all dogs & all trials, example for
30 dogs & 25 trials, shaped (30, 25)
x_shocked: tensor holding shock count for all dogs & all trials, example for 30 dogs
& 25 trials, shaped (30, 25).
y: tensor holding response for all dogs & trials, example for 30 dogs
& 25 trials, shaped (30, 25).
Output
--------
Implements pystan model: {
alpha ~ uniform(0.0, 316.2);
beta ~ uniform(0.0, 316.2);
for(dog in 1:Ndogs)
for (trial in 2:Ntrials)
y[dog, trial] ~ bernoulli(exp(alpha * xa[dog, trial] + beta * xs[dog, trial]));}
"""
alpha = pyro.sample("alpha", dist.Uniform(-10, -0.00001))
beta = pyro.sample("beta", dist.Uniform(-10, -0.00001))
with pyro.plate("data"):
pyro.sample("obs", dist.Bernoulli(torch.exp(alpha*x_avoidance + beta * x_shocked)), obs=y)
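A quick sanity check of why the support is restricted to negative values: with any draw from Uniform(-10, -0.00001), the Bernoulli parameter \(\exp(\alpha Xa + \beta Xs)\) stays in (0, 1]. The counts and the draw below are hypothetical:

```python
import torch

# hypothetical draw; any value in (-10, -0.00001) behaves the same way
alpha, beta = -0.2, -0.01
xa = torch.tensor([[0., 1., 2., 3.]])   # toy avoidance counts
xs = torch.tensor([[0., 1., 1., 1.]])   # toy shock counts
probs = torch.exp(alpha * xa + beta * xs)   # monotonically shrinks as counts grow
print(probs)
```

Had the prior allowed positive \(\alpha\) or \(\beta\), the exponent could become positive and the Bernoulli parameter would exceed 1.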
hmc_sample_chains_uniform_prior= get_hmc_n_chains(DogsModelUniformPrior, x_avoidance, x_shocked, y, num_chains=4, base_count = 900)
Sample: 100%|██████████| 1650/1650 [00:07, 218.08it/s, step size=8.93e-01, acc. prob=0.920]
Sample: 100%|██████████| 1350/1350 [00:09, 143.45it/s, step size=3.54e-01, acc. prob=0.959]
Sample: 100%|██████████| 1200/1200 [00:05, 208.96it/s, step size=9.27e-01, acc. prob=0.915]
Sample: 100%|██████████| 1250/1250 [00:05, 213.63it/s, step size=8.44e-01, acc. prob=0.923]
Total time: 28.582163095474243
Compute & compare the Deviance Information Criterion for multiple Bayesian models.
compare_DICs_given_model(x_avoidance, x_shocked, y, Dogs_normal_prior= hmc_sample_chains, Dogs_uniform_prior= hmc_sample_chains_uniform_prior)
______________________________
For model : Dogs_normal_prior
. . .DIC for chain_0: 1183.11
. . .DIC for chain_1: 1183.415
. . .DIC for chain_2: 1183.277
. . .DIC for chain_3: 1183.429
. .Mean Deviance information criterion for all chains: 1183.308
______________________________
For model : Dogs_uniform_prior
. . .DIC for chain_0: 1183.34
. . .DIC for chain_1: 1183.387
. . .DIC for chain_2: 1183.115
. . .DIC for chain_3: 1183.385
. .Mean Deviance information criterion for all chains: 1183.307